Voice Over IP

Voice over Internet Protocol (VoIP), also called IP telephony, is a method and group of technologies for the delivery of voice communications and multimedia sessions over Internet Protocol (IP) networks, such as the Internet. The terms Internet telephony, broadband telephony, and broadband phone service specifically refer to the provisioning of communications services (voice, fax, SMS, voice-messaging) over the Internet, rather than via the public switched telephone network (PSTN), also known as plain old telephone service (POTS).


Overview

The steps and principles involved in originating VoIP telephone calls are similar to traditional digital telephony and involve signaling, channel setup, digitization of the analog voice signals, and encoding. Instead of being transmitted over a circuit-switched network, the digital information is packetized and transmitted as IP packets over a packet-switched network. VoIP systems transport media streams using special media delivery protocols that encode audio and video with audio codecs and video codecs. Various codecs exist that optimize the media stream based on application requirements and network bandwidth; some implementations rely on narrowband and compressed speech, while others support high-fidelity stereo codecs. The most widely used speech coding standards in VoIP are based on the linear predictive coding (LPC) and modified discrete cosine transform (MDCT) compression methods. Popular codecs include the MDCT-based AAC-LD (used in FaceTime), the LPC/MDCT-based Opus (used in WhatsApp), the LPC-based SILK (used in Skype), the μ-law and A-law versions of G.711, G.722, an open-source voice codec known as iLBC, and G.729, a codec that uses only 8 kbit/s each way.

Early providers of voice-over-IP services used business models and offered technical solutions that mirrored the architecture of the legacy telephone network. Second-generation providers, such as Skype, built closed networks for private user bases, offering the benefit of free calls and convenience while potentially charging for access to other communication networks, such as the PSTN. This limited the freedom of users to mix-and-match third-party hardware and software. Third-generation providers, such as Google Talk, adopted the concept of federated VoIP. These solutions typically allow dynamic interconnection between users in any two domains of the Internet when a user wishes to place a call.

In addition to VoIP phones, VoIP is also available on many personal computers and other Internet access devices. Calls and SMS text messages may be sent via Wi-Fi or the carrier's mobile data network. VoIP provides a framework for consolidation of all modern communications technologies using a single unified communications system.
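
As a rough illustration of how packetization and codec choice interact with network bandwidth, the following Python sketch estimates the one-way IP-layer bit rate of a single call. It assumes a constant-bitrate codec, 20 ms of audio per packet by default, and 40 bytes of IPv4, UDP, and RTP headers per packet with no options and no link-layer overhead. The function name `ip_bandwidth_bps` is hypothetical, and the results are estimates rather than measurements.

```python
# Header sizes in bytes, assuming IPv4 + UDP + RTP without options or extensions.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def ip_bandwidth_bps(codec_bps: int, ptime_ms: int = 20) -> float:
    """Estimate one-way IP-layer bandwidth in bit/s for a constant-bitrate codec.

    codec_bps: codec payload bit rate (for example 64000 for G.711).
    ptime_ms:  milliseconds of audio carried in each packet (assumed value).
    """
    # Bytes of encoded audio carried in each packet.
    payload_bytes = codec_bps * ptime_ms // 8000
    # Every packet also carries the fixed protocol headers.
    packet_bytes = payload_bytes + IPV4_HEADER + UDP_HEADER + RTP_HEADER
    # Packets per second is 1000 / ptime_ms.
    return packet_bytes * 8 * 1000 / ptime_ms


if __name__ == "__main__":
    for name, rate in (("G.711", 64_000), ("G.729", 8_000)):
        print(f"{name}: {ip_bandwidth_bps(rate) / 1000:.0f} kbit/s")
```

Under these assumptions, G.711 at 64 kbit/s needs about 80 kbit/s at the IP layer and G.729 at 8 kbit/s about 24 kbit/s, because the fixed header overhead dominates when the payload is small.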


Pronunciation

''VoIP'' is variously pronounced as an initialism, ''V-O-I-P'', or as an acronym. Full words, ''voice over Internet Protocol'', or ''voice over IP'', are sometimes used.


Protocols

Voice over IP has been implemented with proprietary protocols and protocols based on open standards in applications such as VoIP phones, mobile applications, and web-based communications.

A variety of functions are needed to implement VoIP communication. Some protocols perform multiple functions, while others perform only a few and must be used in concert. These functions include:

* ''Network'' and ''transport'' – Creating reliable transmission over unreliable protocols, which may involve acknowledging receipt of data and retransmitting data that wasn't received.
* ''Session management'' – Creating and managing a session (sometimes glossed as simply a "call"), which is a connection between two or more peers that provides a context for further communication.
* ''Signaling'' – Performing registration (advertising one's presence and contact information) and discovery (locating someone and obtaining their contact information), dialing (including reporting call progress), negotiating capabilities, and call control (such as hold, mute, transfer/forwarding, dialing DTMF keys during a call [e.g. to interact with an automated attendant or IVR], etc.).
* ''Media description'' – Determining what type of media to send (audio, video, etc.), how to encode/decode it, and how to send/receive it (IP addresses, ports, etc.).
* ''Media'' – Transferring the actual media in the call, such as audio, video, text messages, files, etc.
* ''Quality of service'' – Providing out-of-band content or feedback about the media such as synchronization, statistics, etc.
* ''Security'' – Implementing access control, verifying the identity of other participants (computers or people), and encrypting data to protect the privacy and integrity of the media contents and/or the control messages.

VoIP protocols include:

* Session Initiation Protocol (SIP), connection management protocol developed by the IETF
* H.323, one of the first VoIP call signaling and control protocols that found widespread implementation. Since the development of newer, less complex protocols such as MGCP and SIP, H.323 deployments are increasingly limited to carrying existing long-haul network traffic.
* Media Gateway Control Protocol (MGCP), connection management for media gateways
* H.248, control protocol for media gateways across a converged internetwork consisting of the traditional PSTN and modern packet networks
* Real-time Transport Protocol (RTP), transport protocol for real-time audio and video data
* Real-time Transport Control Protocol (RTCP), sister protocol for RTP providing stream statistics and status information
* Secure Real-time Transport Protocol (SRTP), encrypted version of RTP
* Session Description Protocol (SDP), a syntax for session initiation and announcement for multi-media communications and WebSocket transports.
* Inter-Asterisk eXchange (IAX), protocol used between Asterisk PBX instances
* Extensible Messaging and Presence Protocol (XMPP), instant messaging, presence information, and contact list maintenance
* Jingle, for peer-to-peer session control in XMPP
* Skype protocol, proprietary Internet telephony protocol suite based on peer-to-peer architecture
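
To make the ''Media'' function in the list above more concrete, the following Python sketch packs and parses the 12-byte fixed RTP header defined in RFC 3550. It is a minimal illustration that assumes no padding, no header extension, and no CSRC entries. The helper names `pack_rtp` and `unpack_rtp` are hypothetical, and payload type 0 is the static assignment for G.711 μ-law.

```python
import struct

RTP_VERSION = 2
# Network byte order: flags byte, marker/payload-type byte,
# 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
HEADER_FORMAT = "!BBHII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 12 bytes


def pack_rtp(payload_type: int, seq: int, timestamp: int, ssrc: int,
             payload: bytes, marker: bool = False) -> bytes:
    """Build an RTP packet with the fixed header only (assumes P=0, X=0, CC=0)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack(HEADER_FORMAT, byte0, byte1,
                         seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def unpack_rtp(packet: bytes) -> dict:
    """Parse the fixed header; valid only for packets without CSRC or extensions."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack(
        HEADER_FORMAT, packet[:HEADER_SIZE])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[HEADER_SIZE:],
    }


if __name__ == "__main__":
    # 20 ms of G.711 audio at 8000 samples/s is 160 one-byte samples.
    packet = pack_rtp(payload_type=0, seq=1, timestamp=160,
                      ssrc=0x1234ABCD, payload=b"\xff" * 160)
    print(len(packet), unpack_rtp(packet)["payload_type"])
```

In a real call the sequence number would increase by one per packet and the timestamp by the number of samples per packet; signaling protocols such as SIP with SDP would first negotiate the payload type and the addresses to send to.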


Adoption


Consumer market

Mass-market VoIP services use existing broadband Internet access, by which subscribers place and receive telephone calls in much the same manner as they would via the PSTN. Full-service VoIP phone companies provide inbound and outbound service with direct inbound dialing. Many offer unlimited domestic calling and sometimes international calls for a flat monthly subscription fee. Phone calls between subscribers of the same provider are usually free when flat-fee service is not available.

A VoIP phone is necessary to connect to a VoIP service provider. This can be implemented in several ways:

* Dedicated VoIP phones connect directly to the IP network using technologies such as wired Ethernet or Wi-Fi. These are typically designed in the style of traditional digital business telephones.
* An analog telephone adapter connects to the network and implements the electronics and firmware to operate a conventional analog telephone attached through a modular phone jack. Some residential Internet gateways and cable modems have this function built in.
* Softphone application software installed on a networked computer that is equipped with a microphone and speaker, or headset. The application typically presents a dial pad and display field to the user to operate the application by mouse clicks or keyboard input.


PSTN and mobile network providers

It is increasingly common for telecommunications providers to use VoIP telephony over dedicated and public IP networks as a backhaul to connect switching centers and to interconnect with other telephony network providers; this is often referred to as ''IP backhaul''. Smartphones may have SIP clients built into the firmware or available as an application download.


Corporate use

Because of the bandwidth efficiency and low costs that VoIP technology can provide, businesses are migrating from traditional copper-wire telephone systems to VoIP systems to reduce their monthly phone costs. In 2008, 80% of all new private branch exchange (PBX) lines installed internationally were VoIP. For example, in the United States, the Social Security Administration is converting its field offices of 63,000 workers from traditional phone installations to a VoIP infrastructure carried over its existing data network.

VoIP allows both voice and data communications to be run over a single network, which can significantly reduce infrastructure costs. The prices of extensions on VoIP are lower than for PBX and key systems. VoIP switches may run on commodity hardware, such as personal computers. Rather than closed architectures, these devices rely on standard interfaces. VoIP devices have simple, intuitive user interfaces, so users can often make simple system configuration changes. Dual-mode phones enable users to continue their conversations as they move between an outside cellular service and an internal Wi-Fi network, so that it is no longer necessary to carry both a desktop phone and a cell phone. Maintenance becomes simpler as there are fewer devices to oversee.

VoIP solutions aimed at businesses have evolved into unified communications services that treat all communications (phone calls, faxes, voice mail, e-mail, web conferences, and more) as discrete units that can all be delivered via any means and to any handset, including cellphones. Two kinds of service providers are operating in this space: one set is focused on VoIP for medium to large enterprises, while another is targeting the small-to-medium business (SMB) market. Skype, which originally marketed itself as a service among friends, has begun to cater to businesses, providing free-of-charge connections between any users on the Skype network and connecting to and from ordinary PSTN telephones for a charge.


Delivery mechanisms

In general, the provision of VoIP telephony systems to organizational or individual users can be divided into two primary delivery methods: private or on-premises solutions, or externally hosted solutions delivered by third-party providers. On-premises delivery methods are more akin to the classic PBX deployment model for connecting an office to local PSTN networks. While many use cases still remain for private or on-premises VoIP systems, the wider market has been gradually shifting toward ''Cloud'' or ''Hosted'' VoIP solutions. Hosted systems are also generally better suited to smaller or personal use VoIP deployments, where a private system may not be viable.


Hosted VoIP systems

''Hosted'' or ''Cloud'' VoIP solutions involve a service provider or telecommunications carrier hosting the telephone system as a software solution within their own infrastructure. Typically this will be one or more datacentres, with geographic relevance to the end-user(s) of the system. This infrastructure is external to the user of the system and is deployed and maintained by the service provider. Endpoints, such as VoIP telephones or softphone applications (apps running on a computer or mobile device), will connect to the VoIP service remotely. These connections typically take place over public internet links, such as local fixed WAN breakout or mobile carrier service.


Private VoIP systems

In the case of a private VoIP system, the primary telephony system itself is located within the private infrastructure of the end-user organization. Usually, the system will be deployed on-premises at a site within the direct control of the organization. This can provide numerous benefits in terms of QoS control (see below), cost scalability, and ensuring privacy and security of communications traffic. However, the responsibility for ensuring that the VoIP system remains performant and resilient is predominantly vested in the end-user organization. This is not the case with a Hosted VoIP solution.

Private VoIP systems can be physical hardware PBX appliances, converged with other infrastructure, or they can be deployed as software applications. Generally, the latter two options will be in the form of a separate virtualized appliance. However, in some scenarios, these systems are deployed on bare metal infrastructure or IoT devices. With some solutions, such as 3CX, companies can attempt to blend the benefits of hosted and private on-premises systems by implementing their own private solution but within an external environment. Examples can include datacentre collocation services, public cloud, or private cloud locations.

For on-premises systems, local endpoints within the same location typically connect directly over the LAN. For remote and external endpoints, available connectivity options mirror those of Hosted or Cloud VoIP solutions. However, VoIP traffic to and from the on-premises systems can often also be sent over secure private links. Examples include personal VPN, site-to-site VPN, private networks such as MPLS and SD-WAN, or via private SBCs (Session Border Controllers). While exceptions and private peering options do exist, it is generally uncommon for those private connectivity methods to be provided by Hosted or Cloud VoIP providers.


Quality of service

Communication on the IP network is perceived as less reliable than the circuit-switched public telephone network because it does not provide a network-based mechanism to ensure that data packets are not lost and are delivered in sequential order. It is a best-effort network without fundamental quality of service (QoS) guarantees. Voice, and all other data, travels in packets over IP networks with fixed maximum capacity. This system may be more prone to data loss in the presence of congestion than traditional circuit-switched systems; a circuit-switched system of insufficient capacity will refuse new connections while carrying the remainder without impairment, while the quality of real-time data such as telephone conversations on packet-switched networks degrades dramatically. Therefore, VoIP implementations may face problems with latency, packet loss, and jitter.

By default, network routers handle traffic on a first-come, first-served basis. Fixed delays cannot be controlled as they are caused by the physical distance the packets travel. They are especially problematic when satellite circuits are involved because of the long distance to a geostationary satellite and back; delays of 400–600 ms are typical. Queuing latency can be reduced by marking voice packets as being delay-sensitive with QoS methods such as DiffServ.

Network routers on high volume traffic links may introduce latency that exceeds permissible thresholds for VoIP. Excessive load on a link can cause congestion and associated queueing delays and packet loss. This signals a transport protocol like TCP to reduce its transmission rate to alleviate the congestion. But VoIP usually uses UDP rather than TCP because recovering from congestion through retransmission usually entails too much latency. QoS mechanisms can therefore avoid the undesirable loss of VoIP packets by immediately transmitting them ahead of any queued bulk traffic on the same link, even when the link is congested by bulk traffic.

VoIP endpoints usually have to wait for the completion of transmission of previous packets before new data may be sent. Although it is possible to preempt (abort) a less important packet in mid-transmission, this is not commonly done, especially on high-speed links where transmission times are short even for maximum-sized packets. An alternative to preemption on slower links, such as dialup and digital subscriber line (DSL), is to reduce the maximum transmission time by reducing the maximum transmission unit. But since every packet must contain protocol headers, this increases relative header overhead on every link traversed.

The receiver must resequence IP packets that arrive out of order and recover gracefully when packets arrive too late or not at all.
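As an illustration of marking voice packets as delay-sensitive with DiffServ, the following minimal sketch sets the DSCP field on a UDP socket. It assumes the Expedited Forwarding code point (46), which is commonly used for voice, and a platform that exposes the `IP_TOS` socket option (for example Linux); the function names are hypothetical, and whether routers along the path honor the marking depends on network policy.

```python
import socket

# Expedited Forwarding code point, commonly used for voice (assumption: the
# network is configured to give EF traffic priority treatment).
DSCP_EF = 46


def tos_byte(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper six bits of the IP TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2


def open_voice_socket(dscp: int = DSCP_EF) -> socket.socket:
    """Create a UDP socket whose outgoing IPv4 packets carry the given DSCP.

    Hypothetical helper; IP_TOS is not available on every platform.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp))
    return sock
```

The marking only requests priority; it has an effect only where routers are configured to queue marked packets ahead of bulk traffic.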
Packet delay variation results from changes in queuing delay along a given network path due to competition from other users for the same transmission links. VoIP receivers accommodate this variation by storing incoming packets briefly in a playout buffer, deliberately increasing latency to improve the chance that each packet will be on hand when it is time for the voice engine to play it. The added delay is thus a compromise between excessive latency and excessive dropout, i.e. momentary audio interruptions.

Although jitter is a random variable, it is the sum of several other random variables that are at least somewhat independent: the individual queuing delays of the routers along the Internet path in question. Motivated by the central limit theorem, jitter can be modeled as a Gaussian random variable. This suggests continually estimating the mean delay and its standard deviation and setting the playout delay so that only packets delayed more than several standard deviations above the mean will arrive too late to be useful. In practice, the variance in latency of many Internet paths is dominated by a small number (often one) of relatively slow and congested bottleneck links. Most Internet backbone links are now so fast (e.g. 10 Gbit/s) that their delays are dominated by the transmission medium (e.g. optical fiber) and the routers driving them do not have enough buffering for queuing delays to be significant.

A number of protocols have been defined to support the reporting of quality of service (QoS) and quality of experience (QoE) for VoIP calls. These include RTP Control Protocol (RTCP) extended reports, SIP RTCP summary reports, H.460.9 Annex B (for H.323), H.248.30 and MGCP extensions. The RTCP extended report VoIP metrics block is generated by a VoIP phone or gateway during a live call and contains information on packet loss rate, packet discard rate (because of jitter), packet loss/discard burst metrics (burst length/density, gap length/density), network delay, end system delay, signal/noise/echo level, mean opinion scores (MOS) and R factors, and configuration information related to the jitter buffer. VoIP metrics reports are exchanged between IP endpoints on an occasional basis during a call, and an end-of-call message is sent via SIP RTCP summary report or one of the other signaling protocol extensions. VoIP metrics reports are intended to support real-time feedback related to QoS problems, the exchange of information between the endpoints for improved call quality calculation and a variety of other applications.
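The playout-delay rule described earlier in this section (mean delay plus several standard deviations under a Gaussian model) can be sketched as follows. This is a minimal illustration, not a production jitter buffer: it works on a fixed window of measured one-way delays, whereas real receivers typically update running estimates per packet. The function names and the default multiplier `k = 3` are assumptions.

```python
from statistics import mean, pstdev


def playout_delay_ms(delays_ms, k=3.0):
    """Return a playout delay of mean + k standard deviations.

    delays_ms: recent per-packet network delays in milliseconds.
    k: how many standard deviations of headroom to allow (assumed value).
    """
    if len(delays_ms) < 2:
        raise ValueError("need at least two delay samples")
    return mean(delays_ms) + k * pstdev(delays_ms)


def late_fraction(delays_ms, playout_ms):
    """Fraction of packets that would arrive after their playout time."""
    late = sum(1 for d in delays_ms if d > playout_ms)
    return late / len(delays_ms)
```

A larger `k` lowers the fraction of late packets at the cost of more added latency, which is the compromise between latency and dropout noted above.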


DSL and ATM

DSL modems typically provide Ethernet connections to local equipment, but inside they may actually be Asynchronous Transfer Mode (ATM) modems. They use ATM Adaptation Layer 5 (AAL5) to segment each Ethernet packet into a series of 53-byte ATM cells for transmission, reassembling them back into Ethernet frames at the receiving end.

Using a separate virtual circuit identifier (VCI) for audio over IP has the potential to reduce latency on shared connections. ATM's potential for latency reduction is greatest on slow links because worst-case latency decreases with increasing link speed. A full-size (1500 byte) Ethernet frame takes 94 ms to transmit at 128 kbit/s but only 8 ms at 1.5 Mbit/s. If the 1.5 Mbit/s link is the bottleneck, this latency is probably small enough to ensure good VoIP performance without MTU reductions or multiple ATM VCs. The latest generations of DSL, VDSL and VDSL2, carry Ethernet without intermediate ATM/AAL5 layers, and they generally support IEEE 802.1p priority tagging so that VoIP can be queued ahead of less time-critical traffic.

ATM has substantial header overhead: 5/53 = 9.4%, roughly twice the total header overhead of a 1500 byte Ethernet frame. This "ATM tax" is incurred by every DSL user whether or not they take advantage of multiple virtual circuits, and few can.
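The figures in this section can be checked with a short calculation. The sketch below computes the serialization delay of a 1500-byte frame at the two link speeds mentioned and the ATM cell header fraction. The AAL5 cell count assumes an 8-byte AAL5 trailer with padding to a multiple of 48 bytes and ignores any additional encapsulation headers, which is a simplification; the function names are illustrative.

```python
import math

ATM_CELL_BYTES = 53      # total cell size
ATM_PAYLOAD_BYTES = 48   # payload carried per cell
ATM_HEADER_BYTES = ATM_CELL_BYTES - ATM_PAYLOAD_BYTES
AAL5_TRAILER_BYTES = 8   # assumption: trailer only, no extra encapsulation


def serialization_delay_ms(frame_bytes, link_bps):
    """Time to clock a frame onto a link, in milliseconds."""
    return frame_bytes * 8 / link_bps * 1000


def aal5_cells(payload_bytes):
    """Number of ATM cells needed to carry one packet over AAL5."""
    return math.ceil((payload_bytes + AAL5_TRAILER_BYTES) / ATM_PAYLOAD_BYTES)


slow = serialization_delay_ms(1500, 128_000)     # about 94 ms
fast = serialization_delay_ms(1500, 1_500_000)   # about 8 ms
header_fraction = ATM_HEADER_BYTES / ATM_CELL_BYTES  # about 9.4 %
cells = aal5_cells(1500)                         # cells for a full-size packet
```

Under these assumptions a 1500-byte packet occupies 32 cells, so the bytes actually sent on the wire exceed the packet size by more than the per-cell header fraction alone, because of the trailer and padding.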


Layer 2

Several protocols are used in the data link layer and physical layer for quality-of-service mechanisms that help VoIP applications work well even in the presence of network congestion. Some examples include:
* IEEE 802.11e is an approved amendment to the IEEE 802.11 standard that defines a set of quality-of-service enhancements for wireless LAN applications through modifications to the Media Access Control (MAC) layer. The standard is considered of critical importance for delay-sensitive applications, such as voice over wireless IP.
* IEEE 802.1p defines 8 different classes of service (including one dedicated to voice) for traffic on layer-2 wired Ethernet.
* The ITU-T G.hn standard, which provides a way to create a high-speed (up to 1 gigabit per second) local area network (LAN) using existing home wiring (power lines, phone lines and coaxial cables). G.hn provides QoS by means of Contention-Free Transmission Opportunities (CFTXOPs) which are allocated to flows (such as a VoIP call) that require QoS and which have negotiated a "contract" with the network controllers.


Performance metrics

The quality of voice transmission is characterized by several metrics that may be monitored by network elements and by the user agent hardware or software. Such metrics include network packet loss, packet jitter, packet latency (delay), post-dial delay, and echo. The metrics are determined by VoIP performance testing and monitoring.
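As one concrete example of how such a metric can be computed, the sketch below estimates packet jitter with the running interarrival-jitter formula from the RTP specification (RFC 3550), in which each new sample moves the estimate by one sixteenth of the difference. The function name and the use of millisecond timestamps are assumptions for illustration; RTP itself works in media clock units.

```python
def interarrival_jitter(send_times, recv_times):
    """Running jitter estimate in the style of RFC 3550.

    send_times, recv_times: per-packet timestamps in the same unit,
    listed in arrival order. Returns the final smoothed jitter value.
    """
    if len(send_times) != len(recv_times):
        raise ValueError("timestamp lists must have equal length")
    jitter = 0.0
    for i in range(1, len(send_times)):
        # Difference in transit time between consecutive packets.
        d = (recv_times[i] - recv_times[i - 1]) - (send_times[i] - send_times[i - 1])
        # Exponential smoothing with gain 1/16.
        jitter += (abs(d) - jitter) / 16
    return jitter
```

A perfectly periodic stream that arrives with constant delay yields a jitter of zero, regardless of how large that constant delay is.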


PSTN integration

A VoIP media gateway controller (also known as a Class 5 softswitch) works in cooperation with a media gateway (also known as an IP business gateway) and connects the digital media stream, so as to complete the path for voice and data. Gateways include interfaces for connecting to standard PSTN networks. Modern systems also include Ethernet interfaces, which are specially designed to link calls that are passed via VoIP.

E.164 is a global numbering standard for both the PSTN and public land mobile network (PLMN). Most VoIP implementations support E.164 to allow calls to be routed to and from VoIP subscribers and the PSTN/PLMN. VoIP implementations can also allow other identification techniques to be used. For example, Skype allows subscribers to choose Skype names (usernames) whereas SIP implementations can use Uniform Resource Identifiers (URIs) similar to email addresses. Often VoIP implementations employ methods of translating non-E.164 identifiers to E.164 numbers and vice versa, such as the Skype-In service provided by Skype and the E.164 number to URI mapping (ENUM) service in IMS and SIP.

Echo can also be an issue for PSTN integration. Common causes of echo include impedance mismatches in analog circuitry and an acoustic path from the receive to transmit signal at the receiving end.
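The first step of ENUM, turning an E.164 number into a DNS name, can be sketched as below: the digits are reversed, separated by dots, and placed under the `e164.arpa` domain. The function name is illustrative and the phone number in the usage note is an arbitrary example; the subsequent DNS query for NAPTR records that yields the actual URI is not shown.

```python
def enum_domain(e164_number, suffix="e164.arpa"):
    """Build the DNS name that ENUM queries for an E.164 number."""
    # Keep only ASCII digits; "+" and spacing are dropped.
    digits = [c for c in e164_number if c in "0123456789"]
    if not digits:
        raise ValueError("number contains no digits")
    return ".".join(reversed(digits)) + "." + suffix
```

For example, `enum_domain("+44 1632 960083")` produces `3.8.0.0.6.9.2.3.6.1.4.4.e164.arpa`.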


Number portability

Local number portability (LNP) and mobile number portability (MNP) also impact VoIP business. Number portability is a service that allows a subscriber to select a new telephone carrier without requiring a new number to be issued. Typically, it is the responsibility of the former carrier to "map" the old number to the undisclosed number assigned by the new carrier. This is achieved by maintaining a database of numbers. A dialed number is initially received by the original carrier and quickly rerouted to the new carrier. Multiple porting references must be maintained even if the subscriber returns to the original carrier. The FCC mandates carrier compliance with these consumer-protection stipulations. In November 2007, the Federal Communications Commission in the United States released an order extending number portability obligations to interconnected VoIP providers and carriers that support VoIP providers.

A voice call originating in the VoIP environment also faces least-cost routing (LCR) challenges to reach its destination if the number is routed to a mobile phone number on a traditional mobile carrier. LCR is based on checking the destination of each telephone call as it is made, and then sending the call via the network that will cost the customer the least. This rating is subject to some debate given the complexity of call routing created by number portability. With MNP in place, LCR providers can no longer rely on using the network root prefix to determine how to route a call. Instead, they must now determine the actual network of every number before routing the call. Therefore, VoIP solutions also need to handle MNP when routing a voice call. In countries without a central database, like the UK, it may be necessary to query the mobile network about which home network a mobile phone number belongs to. As the popularity of VoIP increases in the enterprise markets because of LCR options, VoIP needs to provide a certain level of reliability when handling calls.
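The effect of number portability on least-cost routing can be illustrated with a small sketch. All carriers, prefixes, numbers, route names and rates below are hypothetical; a real system would query a portability database or the mobile network rather than an in-memory table.

```python
# Hypothetical data: which carrier originally owns each prefix.
PREFIX_OWNER = {"0770": "CarrierA", "0780": "CarrierB"}

# Hypothetical portability database: numbers moved to another carrier.
PORTED_NUMBERS = {"07701234567": "CarrierB"}

# Hypothetical per-minute cost of reaching each carrier over each route.
RATES = {
    "CarrierA": {"RouteX": 0.010, "RouteY": 0.012},
    "CarrierB": {"RouteX": 0.015, "RouteY": 0.009},
}


def serving_network(number):
    """Find the carrier that actually serves a number."""
    # The portability lookup must come first; the prefix alone is unreliable.
    if number in PORTED_NUMBERS:
        return PORTED_NUMBERS[number]
    # Fall back to the longest matching prefix.
    for end in range(len(number), 0, -1):
        owner = PREFIX_OWNER.get(number[:end])
        if owner is not None:
            return owner
    raise LookupError("no carrier known for " + number)


def least_cost_route(number):
    """Return (serving carrier, cheapest route to that carrier)."""
    network = serving_network(number)
    routes = RATES[network]
    return network, min(routes, key=routes.get)
```

In this toy data, a prefix-only lookup would send the ported number over the route that is cheapest for its original carrier, which is the more expensive choice for the carrier that now serves it.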


Emergency calls

A telephone connected to a land line has a direct relationship between a telephone number and a physical location, which is maintained by the telephone company and available to emergency responders via the national emergency response service centers in the form of emergency subscriber lists. When an emergency call is received by a center, the location is automatically determined from its databases and displayed on the operator console.

In IP telephony, no such direct link between location and communications end point exists. Even a provider having wired infrastructure, such as a DSL provider, may know only the approximate location of the device, based on the IP address allocated to the network router and the known service address. Some ISPs do not track the automatic assignment of IP addresses to customer equipment.

IP communication provides for device mobility. For example, a residential broadband connection may be used as a link to a virtual private network of a corporate entity, in which case the IP address being used for customer communications may belong to the enterprise, not the residential ISP. Such off-premises extensions may appear as part of an upstream IP PBX. On mobile devices, e.g., a 3G handset or USB wireless broadband adapter, the IP address has no relationship with any physical location known to the telephony service provider, since a mobile user could be anywhere in a region with network coverage, even roaming via another cellular company.

At the VoIP level, a phone or gateway may identify itself by its account credentials with a Session Initiation Protocol (SIP) registrar. In such cases, the Internet telephony service provider (ITSP) knows only that a particular user's equipment is active. Service providers often provide emergency response services by agreement with the user who registers a physical location and agrees that, if an emergency number is called from the IP device, emergency services are provided to that address only.

Such emergency services are provided by VoIP vendors in the United States by a system called Enhanced 911 (E911), based on the Wireless Communications and Public Safety Act. The VoIP E911 emergency-calling system associates a physical address with the calling party's telephone number. All VoIP providers that provide access to the public switched telephone network are required to implement E911, a service for which the subscriber may be charged. "VoIP providers may not allow customers to opt-out of 911 service." The VoIP E911 system is based on a static table lookup. Unlike in cellular phones, where the location of an E911 call can be traced using assisted GPS or other methods, the VoIP E911 information is accurate only if subscribers keep their emergency address information current.


Fax support

Sending faxes over VoIP networks is sometimes referred to as Fax over IP (FoIP). Transmission of fax documents was problematic in early VoIP implementations, as most voice digitization and compression codecs are optimized for the representation of the human voice and the proper timing of the modem signals cannot be guaranteed in a packet-based, connectionless network. A standards-based solution for reliably delivering fax-over-IP is the T.38 protocol.

The T.38 protocol is designed to compensate for the differences between traditional packet-less communications over analog lines and packet-based transmissions which are the basis for IP communications. The fax machine may be a standard device connected to an analog telephone adapter (ATA), or it may be a software application or dedicated network device operating via an Ethernet interface. Originally, T.38 was designed to use UDP or TCP transmission methods across an IP network. Some newer high-end fax machines have built-in T.38 capabilities and connect directly to a network switch or router. In T.38, each packet contains a copy of a portion of the data stream sent in the previous packet, so two successive packets have to be lost before any data is actually lost.
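The redundancy idea described above, where each packet also carries the previous packet's data, can be sketched as follows. This is a simplified model and not the actual T.38 wire format: packets are plain tuples, only one level of redundancy is used, and the function names are hypothetical.

```python
def add_redundancy(chunks):
    """Turn data chunks into packets of (sequence, current, previous)."""
    packets = []
    previous = None
    for seq, chunk in enumerate(chunks):
        packets.append((seq, chunk, previous))
        previous = chunk
    return packets


def reassemble(received, total):
    """Rebuild the chunk list from whichever packets arrived.

    Missing chunks are left as None. A lost packet is recovered from the
    redundant copy carried by the packet that follows it.
    """
    out = [None] * total
    for seq, current, previous in received:
        out[seq] = current
        if seq > 0 and out[seq - 1] is None:
            out[seq - 1] = previous
    return out
```

With this scheme a single lost packet is fully recovered, while losing two successive packets leaves a gap, which matches the behavior described in the text.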


Power requirements

Telephones for traditional residential analog service are usually connected directly to telephone company phone lines which provide direct current to power most basic analog handsets independently of locally available electrical power. The susceptibility of phone service to power failures is a common problem even with traditional analog service where customers purchase telephone units that operate with wireless handsets to a base station, or that have other modern phone features, such as built-in voicemail or phone book features.

VoIP phones and VoIP telephone adapters connect to routers or cable modems which typically depend on the availability of mains electricity or locally generated power. Some VoIP service providers use customer premises equipment (e.g., cable modems) with battery-backed power supplies to assure uninterrupted service for up to several hours in case of local power failures. Such battery-backed devices typically are designed for use with analog handsets. Some VoIP service providers implement services to route calls to other telephone services of the subscriber, such as a cellular phone, in the event that the customer's network device is inaccessible to terminate the call.


Security

Secure calls are possible using standardized protocols such as the Secure Real-time Transport Protocol (SRTP). Most of the facilities needed to create a secure telephone connection over traditional phone lines, such as digitizing and digital transmission, are already in place with VoIP. It is necessary only to encrypt and authenticate the existing data stream. Automated software, such as a virtual PBX, may eliminate the need for personnel to greet and switch incoming calls.

The security concerns for VoIP telephone systems are similar to those of other Internet-connected devices. Hackers with knowledge of VoIP vulnerabilities can perform denial-of-service attacks, harvest customer data, record conversations, and compromise voicemail messages. Compromised VoIP user account or session credentials may enable an attacker to incur substantial charges from third-party services, such as long-distance or international calling.

The technical details of many VoIP protocols create challenges in routing VoIP traffic through firewalls and network address translators (NAT), which are used to interconnect to transit networks or the Internet. Private session border controllers are often employed to enable VoIP calls to and from protected networks. Other methods to traverse NAT devices involve assistive protocols such as STUN and Interactive Connectivity Establishment (ICE).

Standards for securing VoIP are available in the Secure Real-time Transport Protocol (SRTP) and the ZRTP protocol for analog telephony adapters, as well as for some softphones. IPsec is available to secure point-to-point VoIP at the transport level by using opportunistic encryption. Though many consumer VoIP solutions do not support encryption of the signaling path or the media, securing a phone connection is conceptually easier with VoIP than on traditional telephone circuits. A result of the lack of widespread support for encryption is that it is relatively easy to eavesdrop on VoIP calls when access to the data network is possible. Free open-source solutions, such as Wireshark, facilitate capturing VoIP conversations.

Government and military organizations use various security measures to protect VoIP traffic, such as voice over secure IP (VoSIP), secure voice over IP (SVoIP), and secure voice over secure IP (SVoSIP). The distinction lies in whether encryption is applied in the telephone endpoint or in the network. Secure voice over secure IP may be implemented by encrypting the media with protocols such as SRTP and ZRTP. Secure voice over IP uses Type 1 encryption on a classified network, such as SIPRNet. Public Secure VoIP is also available with free GNU software and in many popular commercial VoIP programs via libraries, such as ZRTP. In June 2021, the National Security Agency (NSA) released comprehensive documents describing the four attack planes of a communications system (the network, perimeter, session controllers, and endpoints) and explaining security risks and mitigation techniques for each of them.
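As an illustration of the NAT traversal protocols mentioned in this section, the following is a minimal sketch of how a STUN Binding Request header can be built and parsed. It assumes the 20-byte header layout of RFC 5389 (message type, message length, magic cookie, 96-bit transaction ID). The function names are hypothetical, and the sketch deliberately stops short of sending the packet or decoding the server's response.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001   # message type of a Binding Request
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389
# Network byte order: type (16 bit), length (16 bit), cookie (32 bit), transaction ID (12 bytes)
STUN_HEADER = struct.Struct("!HHI12s")


def build_binding_request(transaction_id=None):
    """Return a 20-byte STUN Binding Request that carries no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # random ID used to match the response
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    # The length field counts attribute bytes only, so it is 0 here.
    return STUN_HEADER.pack(STUN_BINDING_REQUEST, 0, STUN_MAGIC_COOKIE, transaction_id)


def parse_header(data):
    """Split a STUN header into (type, length, cookie, transaction_id)."""
    if len(data) < STUN_HEADER.size:
        raise ValueError("a STUN header needs at least 20 bytes")
    return STUN_HEADER.unpack(data[:STUN_HEADER.size])
```

In practice a client would send these bytes over UDP to a STUN server and read its public address and port from the response; ICE then uses such addresses as candidates when looking for a working media path.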


Caller ID

Voice over IP protocols and equipment provide caller ID support that is compatible with the PSTN. Many VoIP service providers also allow callers to configure custom caller ID information.


Hearing aid compatibility

Wireline telephones that are manufactured in, imported to, or intended to be used in the US with Voice over IP service on or after February 28, 2020, are required to meet the hearing aid compatibility requirements set forth by the Federal Communications Commission.


Operational cost

VoIP has drastically reduced the cost of communication by sharing network infrastructure between data and voice. A single broadband connection can carry multiple telephone calls simultaneously.
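How many calls fit on one connection depends on the codec and on packet overhead. The following rough sketch estimates this under stated assumptions that are not taken from the text above: a 64 kbit/s codec (such as G.711), 20 ms of audio per packet, and 40 bytes of IPv4, UDP, and RTP headers per packet. It ignores link-layer overhead, signaling, and silence suppression, and the function names are illustrative only.

```python
def call_bandwidth_bps(codec_bps=64_000, packet_ms=20, header_bytes=40):
    """IP-layer bandwidth of one direction of a call, in bits per second."""
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_bps / 8 / packets_per_second  # audio bytes per packet
    return (payload_bytes + header_bytes) * 8 * packets_per_second


def max_calls(link_bps, **call_params):
    """Number of simultaneous calls that fit in one direction of a link."""
    return int(link_bps // call_bandwidth_bps(**call_params))


if __name__ == "__main__":
    print(call_bandwidth_bps())                 # 80000.0 bit/s per call direction
    print(max_calls(1_000_000))                 # 12 calls on a 1 Mbit/s uplink
    print(max_calls(1_000_000, codec_bps=8000)) # 41 calls with an 8 kbit/s codec
```

Under these assumptions the headers add 25% to a 64 kbit/s stream, and for low-bit-rate codecs the headers dominate, so the gain from compression is smaller than the codec rates alone suggest.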


Regulatory and legal issues

As the popularity of VoIP grows, governments are becoming more interested in regulating VoIP in a manner similar to PSTN services. Throughout the developing world, particularly in countries where regulation is weak or captured by the dominant operator, restrictions on the use of VoIP are often imposed, including in Panama, where VoIP is taxed, and Guyana, where VoIP is prohibited. In Ethiopia, where the government is nationalizing telecommunication service, it is a criminal offense to offer services using VoIP. The country has installed firewalls to prevent international calls from being made using VoIP. These measures were taken after the popularity of VoIP reduced the income generated by the state-owned telecommunication company.


Canada

In Canada, the Canadian Radio-television and Telecommunications Commission regulates telephone service, including VoIP telephony service. VoIP services operating in Canada are required to provide 9-1-1 emergency service.


European Union

In the European Union, the treatment of VoIP service providers is a decision for each national telecommunications regulator, which must use competition law to define relevant national markets and then determine whether any service provider on those national markets has "significant market power" (and so should be subject to certain obligations). A general distinction is usually made between VoIP services that function over managed networks (via broadband connections) and VoIP services that function over unmanaged networks (essentially, the Internet). The relevant EU Directive is not clearly drafted concerning obligations that can exist independently of market power (e.g., the obligation to offer access to emergency calls), and it is impossible to say definitively whether VoIP service providers of either type are bound by them.


Arab states of the GCC


Oman

In Oman, it is illegal to provide or use unauthorized VoIP services, to the extent that web sites of unlicensed VoIP providers have been blocked. Violations may be punished with fines of 50,000 Omani rial (about 130,317 US dollars), a two-year prison sentence, or both. In 2009, police raided 121 Internet cafes throughout the country and arrested 212 people for using or providing VoIP services.


Saudi Arabia

In September 2017, Saudi Arabia lifted its ban on VoIP services, in an attempt to reduce operational costs and spur digital entrepreneurship.


United Arab Emirates

In the United Arab Emirates (UAE), it is illegal to provide or use unauthorized VoIP services. Web sites of unlicensed VoIP providers have been blocked. Some VoIP services such as Skype were allowed. In January 2018, internet service providers in the UAE blocked all VoIP apps, including Skype, permitting only two government-approved VoIP apps (C’ME and BOTIM). In opposition, a petition on Change.org garnered over 5000 signatures, in response to which the website was blocked in the UAE. On March 24, 2020, the United Arab Emirates loosened restrictions on VoIP services previously prohibited in the country, to ease communication during the COVID-19 pandemic. However, popular instant messaging applications like WhatsApp, Skype, and FaceTime remained blocked from being used for voice and video calls, restricting residents to paid services from the country's state-owned telecom providers.


India

In India, it is legal to use VoIP, but it is illegal to have VoIP gateways inside India. This effectively means that people who have PCs can use them to make a VoIP call to other computers but not to a normal phone number. Foreign-based VoIP server services are illegal to use in India. Internet telephony is permitted to the ISP with restrictions. The following services are permitted:
1. PC to PC, within or outside India.
2. PC, device, or adapter conforming to the standards of international agencies such as ITU or IETF, in India, to PSTN/PLMN abroad.
3. Any device or adapter conforming to the standards of international agencies such as ITU or IETF, connected to an ISP node with a static IP address, to a similar device or adapter, within or outside India.
4. Except whatever is described in , no other form of Internet telephony is permitted.
5. In India no separate numbering scheme is provided for Internet telephony. Presently the 10-digit numbering allocation based on E.164 is permitted to fixed telephony, GSM, and CDMA wireless service. For Internet telephony, the numbering scheme shall only conform to the IP addressing scheme of the Internet Assigned Numbers Authority (IANA). Translation of an E.164 number or private number to an IP address allotted to any device and vice versa, by an ISP to show compliance with the IANA numbering scheme, is not permitted.
6. The Internet Service Licensee is not permitted to have PSTN/PLMN connectivity. Voice communication to and from a telephone connected to PSTN/PLMN and following E.164 numbering is prohibited in India.


South Korea

In South Korea, only providers registered with the government are authorized to offer VoIP services. Unlike many VoIP providers, most of whom offer flat rates, Korean VoIP services are generally metered and charged at rates similar to terrestrial calling. Foreign VoIP providers encounter high barriers to government registration. This issue came to a head in 2006 when Internet service providers providing personal Internet services by contract to United States Forces Korea (USFK) members residing on USFK bases threatened to block access to VoIP services used by USFK members as an economical way to keep in contact with their families in the United States, on the grounds that the service members' VoIP providers were not registered. A compromise was reached between USFK and Korean telecommunications officials in January 2007, wherein USFK service members arriving in Korea before June 1, 2007, and subscribing to the ISP services provided on base could continue to use their US-based VoIP subscription, but later arrivals are required to use a Korean-based VoIP provider, which by contract will offer pricing similar to the flat rates offered by US VoIP providers.


United States

In the United States, the Federal Communications Commission requires all interconnected VoIP service providers to comply with requirements comparable to those for traditional telecommunications service providers. VoIP operators in the US are required to support local number portability; make service accessible to people with disabilities; pay regulatory fees, universal service contributions, and other mandated payments; and enable law enforcement authorities to conduct surveillance pursuant to the Communications Assistance for Law Enforcement Act (CALEA). Operators of interconnected VoIP (fully connected to the PSTN) are mandated to provide Enhanced 911 service without special request, provide for customer location updates, clearly disclose any limitations on their E-911 functionality to their consumers, obtain affirmative acknowledgements of these disclosures from all consumers, and may not allow their customers to opt out of 911 service. VoIP operators also receive the benefit of certain US telecommunications regulations, including an entitlement to interconnection and exchange of traffic with incumbent local exchange carriers via wholesale carriers. Providers of nomadic VoIP service (those who are unable to determine the location of their users) are exempt from state telecommunications regulation.

Another legal issue that the US Congress is debating concerns changes to the Foreign Intelligence Surveillance Act. The issue in question is calls between Americans and foreigners. The National Security Agency (NSA) is not authorized to tap Americans' conversations without a warrant, but the Internet, and specifically VoIP, does not draw as clear a line to the location of a caller or a call's recipient as the traditional phone system does. As VoIP's low cost and flexibility convince more and more organizations to adopt the technology, surveillance by law enforcement agencies becomes more difficult. VoIP technology has also increased federal security concerns because VoIP and similar technologies have made it more difficult for the government to determine where a target is physically located when communications are being intercepted, which creates a new set of legal challenges.


History

The early developments of packet network designs by Paul Baran and other researchers were motivated by a desire for a higher degree of circuit redundancy and network availability in the face of infrastructure failures than was possible in the circuit-switched telecommunications networks of the mid-twentieth century. Danny Cohen first demonstrated a form of packet voice in 1973 as part of a flight simulator application, which operated across the early ARPANET. On the early ARPANET, real-time voice communication was not possible with uncompressed pulse-code modulation (PCM) digital speech packets, which had a bit rate of 64 kbps, much greater than the 2.4 kbps bandwidth of early modems. The solution to this problem was linear predictive coding (LPC), a speech coding data compression algorithm that was first proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT) in 1966. LPC was capable of speech compression down to 2.4 kbps, leading to the first successful real-time conversation over ARPANET in 1974, between Culler-Harrison Incorporated in Goleta, California, and MIT Lincoln Laboratory in Lexington, Massachusetts. LPC has since been the most widely used speech coding method.
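The bit-rate gap described above can be checked with simple arithmetic. The sketch below assumes the standard telephony PCM parameters of 8,000 samples per second at 8 bits per sample, which the text does not state explicitly but which yield the quoted 64 kbps; the function names are illustrative only.

```python
def pcm_bit_rate(sample_rate_hz=8000, bits_per_sample=8):
    """Bit rate of uncompressed PCM speech in bits per second."""
    return sample_rate_hz * bits_per_sample


def fits_channel(stream_bps, channel_bps):
    """True if a stream can be carried in real time over the channel."""
    return stream_bps <= channel_bps


def compression_ratio(original_bps, compressed_bps):
    """How many times smaller the compressed stream is."""
    return original_bps / compressed_bps


MODEM_BPS = 2400  # bandwidth of early modems, as quoted in the text
LPC_BPS = 2400    # bit rate LPC could reach, as quoted in the text

if __name__ == "__main__":
    pcm = pcm_bit_rate()
    print(pcm)                                       # 64000
    print(fits_channel(pcm, MODEM_BPS))              # False
    print(fits_channel(LPC_BPS, MODEM_BPS))          # True
    print(round(compression_ratio(pcm, LPC_BPS), 1)) # 26.7
```

Uncompressed PCM therefore needed roughly 27 times the capacity that an early modem link offered, which is the reduction LPC had to deliver for real-time speech.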
Code-excited linear prediction Code-excited linear prediction (CELP) is a linear predictive speech coding algorithm originally proposed by Manfred R. Schroeder and Bishnu S. Atal in 1985. At the time, it provided significantly better quality than existing low bit-rate algori ...
(CELP), a type of LPC algorithm, was developed by Manfred R. Schroeder and Bishnu S. Atal in 1985.M. R. Schroeder and B. S. Atal, "Code-excited linear prediction (CELP): high-quality speech at very low bit rates," in ''Proceedings of the IEEE
International Conference on Acoustics, Speech, and Signal Processing ICASSP, the International Conference on Acoustics, Speech, and Signal Processing, is an annual flagship conference organized of IEEE Signal Processing Society. All papers included in its proceedings have been indexed by Ei Compendex. The first ICAS ...
'' (ICASSP), vol. 10, pp. 937–940, 1985.
LPC algorithms remain an audio coding standard in modern VoIP technology. Over the following two decades, various forms of packet telephony were developed, and industry interest groups formed to support the new technologies. Following the termination of the ARPANET project and the expansion of the Internet for commercial traffic, IP telephony was tested and deemed infeasible for commercial use until the introduction of VocalChat in the early 1990s and then, in February 1995, the official release of the commercial Internet Phone software (or iPhone for short) by VocalTec, based on the Audio Transceiver patent by Lior Haramaty and Alon Cohen, and followed by other VoIP infrastructure components such as telephony gateways and switching servers. Soon after, it became an established area of interest in the commercial labs of the major IT concerns. By the late 1990s, the first
softswitches became available, and new protocols, such as H.323, MGCP and the Session Initiation Protocol (SIP), gained widespread attention. In the early 2000s, the proliferation of high-bandwidth always-on Internet connections to residential dwellings and businesses spawned an industry of Internet telephony service providers (ITSPs). The development of open-source telephony software, such as Asterisk PBX, fueled widespread interest and entrepreneurship in voice-over-IP services, applying new Internet technology paradigms, such as cloud services, to telephony. In 1999, a discrete cosine transform (DCT)
audio data compression algorithm called the modified discrete cosine transform (MDCT) was adopted for the Siren codec, used in the G.722.1 wideband audio coding standard. The same year, the MDCT was adapted into the LD-MDCT speech coding algorithm, used for the AAC-LD format and intended for significantly improved audio quality in VoIP applications. MDCT has since been widely used in VoIP applications, such as the G.729.1 wideband codec introduced in 2006, Apple's FaceTime (using AAC-LD) introduced in 2010, the CELT codec introduced in 2011 (presentation of the CELT codec by Timothy B. Terriberry: 65 minutes of video, see also presentation slides in PDF), the Opus codec introduced in 2012, and WhatsApp's voice calling feature introduced in 2015.


Milestones

* 1966: Linear predictive coding (LPC) proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT).
* 1973: Packet voice application by Danny Cohen.
* 1974: The Institute of Electrical and Electronics Engineers (IEEE) publishes a paper entitled "A Protocol for Packet Network Interconnection".
* 1974: Network Voice Protocol (NVP) tested over ARPANET in August 1974, carrying barely audible 16 kbit/s CVSD encoded voice.
* 1974: The first successful real-time conversation over ARPANET achieved using 2.4 kbit/s LPC, between Culler-Harrison Incorporated in Goleta, California, and MIT Lincoln Laboratory in Lexington, Massachusetts.
* 1977: Danny Cohen and Jon Postel of the USC Information Sciences Institute, and Vint Cerf of the Defense Advanced Research Projects Agency (DARPA), agree to separate IP from TCP, and create UDP for carrying real-time traffic.
* 1981: IPv4 is described in RFC 791.
* 1985: The National Science Foundation commissions the creation of NSFNET.
* 1985: Code-excited linear prediction (CELP), a type of LPC algorithm, developed by Manfred R. Schroeder and Bishnu S. Atal.
* 1986: Proposals from various standards organizations for Voice over ATM, in addition to commercial packet voice products from companies such as StrataCom.
* 1991: Speak Freely, a voice-over-IP application, was released to the public domain.
* 1992: The Frame Relay Forum conducts development of standards for Voice over Frame Relay.
* 1992: InSoft Inc. announces and launches its desktop conferencing product Communique, which included VoIP and video. The company is credited with developing the first generation of commercial, US-based VoIP, Internet media streaming and real-time Internet telephony/collaborative software and standards that would provide the basis for the Real Time Streaming Protocol (RTSP) standard.
* 1993: Release of VocalChat, a commercial packet network PC voice communication software from VocalTec.
* 1994: MTALK, a freeware LAN VoIP application for Linux.
* 1995: VocalTec releases ''Internet Phone'' commercial Internet phone software.
** Beginning in 1995, Intel, Microsoft and Radvision initiated standardization activities for VoIP communications systems.
* 1996:
** ITU-T begins development of standards for the transmission and signaling of voice communications over Internet Protocol networks with the H.323 standard.
** US telecommunication companies petition the US Congress to ban Internet phone technology.
** G.729 speech codec introduced, using CELP (LPC) algorithm.
* 1997: Level 3 began development of its first softswitch, a term they coined in 1998.
* 1999:
** The Session Initiation Protocol (SIP) specification RFC 2543 is released.
** Mark Spencer of Digium develops the first open source private branch exchange (PBX) software (Asterisk).
** A discrete cosine transform (DCT) variant called the modified discrete cosine transform (MDCT) is adopted for the Siren codec, used in the G.722.1 wideband audio coding standard.
** The MDCT is adapted into the LD-MDCT algorithm, used in the AAC-LD standard.
* 2001: INOC-DBA, first inter-provider SIP network deployed; also first voice network to reach all seven continents.
* 2003: First released in August 2003, Skype was the creation of Niklas Zennström and Janus Friis, in cooperation with four Estonian developers. It quickly became a popular program that helped democratise VoIP.
* 2004: Commercial VoIP service providers proliferate.
* 2006: G.729.1 wideband codec introduced, using MDCT and CELP (LPC) algorithms.
* 2007: VoIP device manufacturers and sellers boom in Asia, specifically in the Philippines where many families of overseas workers reside.
* 2009: SILK codec introduced, using LPC algorithm, and used for voice calling in Skype (audio recording of the IETF Codec working group meeting at the IETF 79 conference in Beijing, China, with a presentation of the basic functional principles by Koen Vos; MP3, ~70 MiB).
* 2010: Apple introduces FaceTime, which uses the LD-MDCT-based AAC-LD codec.
* 2011:
** Rise of WebRTC technology which allows VoIP directly in browsers.
** CELT codec introduced, using MDCT algorithm.
* 2012: Opus codec introduced, using MDCT and LPC algorithms.


See also

* Audio over IP
* Communications Assistance For Law Enforcement Act
* Comparison of audio network protocols
* Comparison of VoIP software
* Differentiated services
* High bit rate audio video over Internet Protocol
* Integrated services
* Internet fax
* IP Multimedia Subsystem
* List of VoIP companies
* Mobile VoIP
* Network Voice Protocol
* RTP payload formats
* SIP trunking
* UNIStim
* VoIP VPN
* VoiceXML
* VoIP recording


Notes


References


External links
